Multicore-optimized wavefront diamond blocking for optimizing stencil updates
Authors
Abstract
The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, especially the main memory interface. In this work we combine the ideas of multicore wavefront temporal blocking and diamond tiling to arrive at stencil update schemes that show large reductions in memory pressure compared to existing approaches. The resulting schemes show performance advantages in bandwidth-starved situations, which are exacerbated by the high bytes per lattice update case of variable coefficients. Our thread groups concept provides a controllable trade-off between concurrency and memory usage, shifting the pressure between the memory interface and the CPU. We present performance results on a contemporary Intel processor.
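The diamond tiling idea named in the abstract can be illustrated with a small schedule-order sketch. It is not the authors' implementation: it has no wavefront traversal and no thread groups, and it assumes a 1D 3-point Jacobi stencil with fixed boundary values. The function names and the `width` parameter are hypothetical. The sketch only shows that updates grouped into diamond-shaped space-time tiles can run tile by tile and still reproduce the result of plain sweeps.

```python
def jacobi_point(src, x):
    """3-point Jacobi average for one interior point (assumed example stencil)."""
    return (src[x - 1] + src[x] + src[x + 1]) / 3.0


def naive_sweeps(u0, steps):
    """Reference: one full sweep over the grid per time step."""
    bufs = [list(u0), list(u0)]  # ping-pong buffers; boundary values stay fixed
    for t in range(1, steps + 1):
        src, dst = bufs[(t - 1) % 2], bufs[t % 2]
        for x in range(1, len(u0) - 1):
            dst[x] = jacobi_point(src, x)
    return bufs[steps % 2]


def diamond_order(n, steps, width):
    """Return all (t, x) updates ordered tile by tile over diamond tiles."""
    points = [(t, x) for t in range(1, steps + 1) for x in range(1, n - 1)]

    def key(point):
        t, x = point
        # Tile coordinates along the two diagonals of the space-time grid.
        # Every dependency (t-1, x-1..x+1) has tile coordinates that are
        # less than or equal to these, so sorting by them is a legal order.
        return ((t + x) // width, (t - x) // width, t)

    return sorted(points, key=key)


def diamond_sweeps(u0, steps, width):
    """Run the same updates as naive_sweeps, but in diamond-tile order."""
    bufs = [list(u0), list(u0)]
    for t, x in diamond_order(len(u0), steps, width):
        bufs[t % 2][x] = jacobi_point(bufs[(t - 1) % 2], x)
    return bufs[steps % 2]


if __name__ == "__main__":
    start = [float((7 * i) % 11) for i in range(20)]
    print(diamond_sweeps(start, 8, 4) == naive_sweeps(start, 8))
```

In a real temporal blocking scheme the benefit comes from keeping each tile's working set in cache across several time steps. This sketch does not model caches. It only checks that the reordering is legal.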
Similar resources
Optimizing Stencil Computations: Multicore-optimized Wavefront Diamond Blocking on Shared and Distributed Memory Systems
Iterative Stencil Computations (ISC) appear in a wide variety of scientific applications, with partial differential equation (PDE) solvers being the most important. In iterative stencil computations, each point in a multi-dimensional spatial grid is updated using weighted contributions from its neighbor points, defined by the stencil operator. The stencil operator specifies the relative coordinate...
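The update rule described in this snippet can be sketched as follows. This is a minimal illustration under assumptions: a 2D grid stored as nested lists, a stencil operator represented as a mapping from relative coordinates to weights, and boundary values left unchanged. The names `apply_stencil`, `iterate`, and `FIVE_POINT` are hypothetical and do not come from the cited work.

```python
def apply_stencil(grid, stencil):
    """Return a new 2D grid after one sweep; boundary values are copied unchanged."""
    rows, cols = len(grid), len(grid[0])
    # Largest offset in the stencil decides how many boundary layers are skipped.
    reach = max(max(abs(di), abs(dj)) for di, dj in stencil)
    out = [row[:] for row in grid]
    for i in range(reach, rows - reach):
        for j in range(reach, cols - reach):
            # Weighted sum of the neighbors named by the relative coordinates.
            out[i][j] = sum(w * grid[i + di][j + dj]
                            for (di, dj), w in stencil.items())
    return out


def iterate(grid, stencil, steps):
    """Apply the stencil repeatedly, one full sweep per time step."""
    for _ in range(steps):
        grid = apply_stencil(grid, stencil)
    return grid


# Assumed example operator: 5-point Jacobi average of the four direct neighbors.
FIVE_POINT = {(-1, 0): 0.25, (1, 0): 0.25, (0, -1): 0.25, (0, 1): 0.25}

if __name__ == "__main__":
    start = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
    print(iterate(start, FIVE_POINT, 1))
```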
Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking
We study the impact of tunable parameters on computational intensity (i.e., inverse code balance) and energy consumption of multicore-optimized wavefront diamond temporal blocking (MWD) applied to different stencil-based update schemes. MWD combines the concepts of diamond tiling and multicore-aware wavefront blocking in order to achieve lower cache size requirements than standard singlecore wa...
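The relation between code balance and computational intensity mentioned in this snippet can be made concrete with a small bandwidth-bound estimate. This is a sketch under stated assumptions, not a model taken from the cited work. The 24 bytes per lattice update (LUP) figure assumes a double-precision stencil that reads one array and writes another with write-allocate, with ideal spatial blocking. The 48 GB/s bandwidth is a hypothetical value, and both function names are illustrative.

```python
def computational_intensity(bytes_per_lup):
    """Computational intensity in LUP/byte, i.e. the inverse of the code balance."""
    return 1.0 / bytes_per_lup


def bandwidth_bound_lups(bandwidth_bytes_per_s, bytes_per_lup):
    """Upper limit on lattice updates per second when memory traffic is the bottleneck."""
    return bandwidth_bytes_per_s / bytes_per_lup


if __name__ == "__main__":
    code_balance = 24.0   # bytes/LUP: 8 load + 8 write-allocate + 8 store (assumed)
    bandwidth = 48e9      # bytes/s (hypothetical machine)
    print(computational_intensity(code_balance))
    print(bandwidth_bound_lups(bandwidth, code_balance))
    # Halving the code balance, e.g. through temporal blocking, doubles the bound.
    print(bandwidth_bound_lups(bandwidth, code_balance / 2))
```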
University of Delaware, Department of Electrical and Computer Engineering, Computer Architecture and Parallel Systems Laboratory. Diamond Tiling: A Tiling Framework for Time-iterated Scientific Applications
This paper fully develops Diamond Tiling, a technique to partition the computations of stencil applications such as FDTD. The Diamond Tiling technique is the result of optimizing the amount of useful computations that can be executed when a region of memory is loaded to the local memory of a multiprocessor chip. Diamond Tiling contributes to the state of the art on time tiling techniques in tha...
Efficient multicore-aware parallelization strategies for iterative stencil computations
Stencil computations consume a major part of runtime in many scientific simulation codes. As prototypes for this class of algorithms we consider the iterative Jacobi and Gauss-Seidel smoothers and aim at highly efficient parallel implementations for cache-based multicore architectures. Temporal cache blocking is a known advanced optimization technique, which can reduce the pressure on the memory...
High Performance Stencil Code Algorithms for GPGPUs
In this paper we investigate how stencil computations can be implemented on state-of-the-art general purpose graphics processing units (GPGPUs). Stencil codes can be found at the core of many numerical solvers and physical simulation codes and are therefore of particular interest to scientific computing research. GPGPUs have gained a lot of attention recently because of their superior floating ...
Journal title:
- SIAM J. Scientific Computing
Volume 37, Issue
Pages -
Publication date 2015